Multinomial Logit Contextual Bandits: Provable Optimality and Practicality
Abstract
We consider a sequential assortment selection problem where the user choice is given by a multinomial logit (MNL) choice model whose parameters are unknown. In each period, the learning agent observes d-dimensional contextual information about the user and the N available items, offers an assortment of size K to the user, and observes bandit feedback of the item chosen from the assortment. We propose upper confidence bound based algorithms for this MNL contextual bandit. The first algorithm is a simple and practical method that achieves O(d√T) regret over T rounds. Next, we propose a second algorithm which achieves O(√(dT)) regret. This matches the lower bound for the MNL bandit problem, up to logarithmic terms, and improves on the best-known result by a √d factor. To establish this sharper regret bound, we present a non-asymptotic confidence bound for the maximum likelihood estimator of the MNL model that may be of independent interest as its own theoretical contribution. We then revisit the simpler, significantly more practical, first algorithm and show that a simple variant of the algorithm achieves the optimal regret for a broad class of important applications.
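As an illustration of the choice model the abstract refers to, the following is a minimal sketch of MNL choice probabilities and of the bandit feedback the agent would observe. It is not the paper's algorithm: the function names, the linear utility x·theta, and the "no choice" outside option with utility 0 are assumptions made for this example, and the confidence-bound construction is not shown.

```python
import math
import random

def mnl_choice_probs(contexts, theta):
    """Return [p_no_choice, p_item_1, ..., p_item_K] for one offered assortment.

    contexts: list of K feature vectors (each of length d) for the offered items.
    theta: parameter vector of length d (unknown to the agent in the bandit setting).
    """
    # Assumed linear utility: inner product of item features and theta.
    utilities = [sum(x_j * t_j for x_j, t_j in zip(x, theta)) for x in contexts]
    # Assumed outside option with utility 0; shift by the max for numerical stability.
    shift = max(0.0, max(utilities))
    weights = [math.exp(0.0 - shift)] + [math.exp(u - shift) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def sample_feedback(contexts, theta, rng):
    """Draw bandit feedback: 0 means no item chosen, i >= 1 means the i-th offered item."""
    probs = mnl_choice_probs(contexts, theta)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical example with d = 2 features and an assortment of K = 3 items.
rng = random.Random(0)
theta = [0.5, -0.2]
assortment = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
probs = mnl_choice_probs(assortment, theta)
choice = sample_feedback(assortment, theta, rng)
```

In this sketch the agent would see only `choice`, never `probs` or `theta`, which is what makes the feedback "bandit" feedback: the parameters have to be estimated from the chosen items alone.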
Similar resources
The Generalized Multinomial Logit Model
The so-called “mixed” or “heterogeneous” multinomial logit (MIXL) model has become popular in a number of fields, especially Marketing, Health Economics and Industrial Organization. In most applications of the model, the vector of consumer utility weights on product attributes is assumed to have a multivariate normal (MVN) distribution in the population. Thus, some consumers care more about som...
Variational Multinomial Logit Gaussian Process
Gaussian process prior with an appropriate likelihood function is a flexible non-parametric model for a variety of learning tasks. One important and standard task is multi-class classification, which is the categorization of an item into one of several fixed classes. A usual likelihood function for this is the multinomial logistic likelihood function. However, exact inference with this model ha...
Multinomial logit random effects models
This article presents a general approach for logit random effects modelling of clustered ordinal and nominal responses. We review multinomial logit random effects models in a unified form as multivariate generalized linear mixed models. Maximum likelihood estimation utilizes adaptive Gauss–Hermite quadrature within a quasi-Newton maximization algorithm. For cases in which this is computationall...
Semantic Scene Segmentation using Random Multinomial Logit
We introduce Random Multinomial Logit (RML), a general multi-class classifier based on an ensemble of multinomial logistic regression models, and apply it to the task of semantic image segmentation. The algorithm is simple, can be trained efficiently, and has near realtime runtime performance. RML combines the desirable properties of multinomial logistic regression, being stable and theoretical...
Kernalized Collaborative Contextual Bandits
We tackle the problem of recommending products in the online recommendation scenario, which occurs many times in real applications. The most famous and explored instances are news recommendations and advertisements. In this work we propose an extension to the state of the art Bandit models to not only take care of different users’ interactions, but also to go beyond the linearity assumption of ...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2021
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v35i10.17111